Overview

Brought to you by YData

Dataset statistics

Number of variables: 7
Number of observations: 43022
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 177
Duplicate rows (%): 0.4%
Total size in memory: 17.1 MiB
Average record size in memory: 417.6 B
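
The percentage and per-record figures follow from the counts above. A minimal sketch of the arithmetic, assuming the reported 17.1 MiB is itself a rounded value, so the per-record result only approximates the reported 417.6 B:

```python
# Counts copied from the dataset statistics above.
n_observations = 43022
n_duplicate_rows = 177
total_size_mib = 17.1  # rounded value as reported

# Share of rows flagged as duplicates, in percent.
duplicate_pct = 100 * n_duplicate_rows / n_observations

# Bytes per row, derived from the rounded total (1 MiB = 1024 * 1024 bytes).
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B (approximate)")
```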

Variable types

Numeric: 1
Text: 2
Categorical: 3
DateTime: 1

Alerts

Dataset has 177 (0.4%) duplicate rows [Duplicates]
Event is highly overall correlated with Subsystem [High correlation]
Subsystem is highly overall correlated with Event [High correlation]
Unknown is highly imbalanced (82.2%) [Imbalance]
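
The 82.2% imbalance figure is consistent with an entropy-based score, 1 - H / log2(k), where H is the Shannon entropy (in bits) of the class frequencies and k is the number of classes. The sketch below assumes that definition, which is not stated in this report, and uses the two class counts listed for the Unknown variable (41872 and 1150). `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H/log2(k) for a list of class counts.

    0 means perfectly balanced; values near 1 mean one class dominates.
    Assumed definition, used here only to check the reported figure.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # not meaningful for a single class
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Class counts of the Unknown variable, as listed in its section.
score = imbalance_score([41872, 1150])
print(f"{score:.1%}")
```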

Reproduction

Analysis started: 2025-03-19 12:37:40.971366
Analysis finished: 2025-03-19 12:39:35.995389
Duration: 1 minute and 55.02 seconds
Software version: ydata-profiling v4.14.0
Download configuration: config.json
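
The duration can be checked from the two timestamps. A minimal sketch using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction block above.
started = datetime.fromisoformat("2025-03-19 12:37:40.971366")
finished = datetime.fromisoformat("2025-03-19 12:39:35.995389")

elapsed = finished - started
# Split the elapsed seconds into whole minutes and remaining seconds.
minutes, seconds = divmod(elapsed.total_seconds(), 60)
print(f"{int(minutes)} minute(s) and {seconds:.2f} seconds")
```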

Variables

ID
Real number (ℝ)

Distinct: 41739
Distinct (%): 97.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 460027.3
Minimum: 1
Maximum: 2616645
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 672.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 34428
Q1: 117920.75
Median: 255915.5
Q3: 379962.5
95-th percentile: 2583662.9
Maximum: 2616645
Range: 2616644
Interquartile range (IQR): 262041.75
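
Range and IQR are simple differences of the quantiles above. The sketch below recomputes them and, as an illustration only, compares the 95th percentile with Tukey's upper fence (Q3 + 1.5 × IQR); the fence is an assumption added here and is not part of the report.

```python
# Quantiles copied from the table above.
minimum, maximum = 1, 2616645
q1, q3 = 117920.75, 379962.5
p95 = 2583662.9

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range

# Tukey's upper fence (illustrative assumption, not reported by the tool).
upper_fence = q3 + 1.5 * iqr

print(value_range, iqr, upper_fence)
print("95th percentile beyond upper fence:", p95 > upper_fence)
```

A 95th percentile far beyond the fence points to a long right tail, which agrees with the positive skewness reported for ID.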

Descriptive statistics

Standard deviation: 716577.04
Coefficient of variation (CV): 1.5576837
Kurtosis: 4.6987662
Mean: 460027.3
Median Absolute Deviation (MAD): 130051.5
Skewness: 2.5255952
Sum: 1.9791295 × 10^10
Variance: 5.1348266 × 10^11
Monotonicity: Not monotonic
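
Several of these values are tied together by simple identities: CV = standard deviation / mean, variance = standard deviation squared, and sum = mean × number of observations. A minimal sketch that checks them against the rounded figures above (small differences are expected because the inputs are rounded):

```python
# Values copied from the report (rounded as printed).
n = 43022
mean = 460027.3
std = 716577.04

cv = std / mean       # coefficient of variation
variance = std ** 2   # variance from the standard deviation
total = mean * n      # sum of all ID values

print(f"CV       = {cv:.7f}")
print(f"Variance = {variance:.7e}")
print(f"Sum      = {total:.7e}")
```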
[Figure: histogram with fixed size bins (bins=50)]

Common Values

Value | Count | Frequency (%)
56279 | 4 | < 0.1%
32716 | 3 | < 0.1%
56375 | 3 | < 0.1%
206869 | 3 | < 0.1%
256673 | 3 | < 0.1%
272781 | 3 | < 0.1%
56340 | 3 | < 0.1%
272743 | 3 | < 0.1%
56365 | 3 | < 0.1%
432918 | 3 | < 0.1%
Other values (41729) | 42991 | 99.9%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | < 0.1%
7 | 1 | < 0.1%
14 | 1 | < 0.1%
19 | 1 | < 0.1%
23 | 1 | < 0.1%
51 | 1 | < 0.1%
55 | 1 | < 0.1%
56 | 1 | < 0.1%
58 | 1 | < 0.1%
59 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
2616645 | 1 | < 0.1%
2616639 | 1 | < 0.1%
2616635 | 1 | < 0.1%
2616632 | 1 | < 0.1%
2616623 | 1 | < 0.1%
2616612 | 1 | < 0.1%
2616606 | 1 | < 0.1%
2616588 | 1 | < 0.1%
2616556 | 1 | < 0.1%
2616543 | 1 | < 0.1%

Node
Text

Distinct: 727
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 3.1 MiB

Length

Max length: 17
Median length: 10
Mean length: 9.798173
Min length: 4

Characters and Unicode

Total characters: 421537
Distinct characters: 30
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 405
Unique (%): 0.9%

Sample

1st row: node-244
2nd row: node-244
3rd row: node-94
4th row: Interconnect-1N01
5th row: Interconnect-0T00

Words

Value | Count | Frequency (%)
gige7 | 4420 | 10.3%
interconnect-0n00 | 3059 | 7.1%
interconnect-1n01 | 2960 | 6.9%
gige6 | 2132 | 5.0%
gige3 | 1457 | 3.4%
interconnect-1t01 | 1404 | 3.3%
interconnect-1n00 | 1089 | 2.5%
gige4 | 971 | 2.3%
interconnect-0n03 | 926 | 2.2%
interconnect-1n03 | 902 | 2.1%
Other values (717) | 23702 | 55.1%

Most occurring characters

Value | Count | Frequency (%)
n | 58444 | 13.9%
e | 55260 | 13.1%
- | 31925 | 7.6%
o | 31888 | 7.6%
0 | 26995 | 6.4%
t | 26592 | 6.3%
c | 26592 | 6.3%
1 | 23240 | 5.5%
g | 20188 | 4.8%
d | 18610 | 4.4%
Other values (20) | 101803 | 24.2%

Most occurring categories, scripts, and blocks

Category, script, and block are all reported as (unknown) for 421537 characters (100.0%). The per-category, per-script, and per-block character frequencies are identical to the most occurring characters.

Subsystem
Categorical

High correlation 

Distinct: 13
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.0 MiB

Value | Count
switch_module | 13278
node | 11144
gige | 10094
unix.hw | 2992
action | 2709
Other values (8) | 2805

Length

Max length: 17
Median length: 14
Mean length: 7.7144484
Min length: 4

Characters and Unicode

Total characters: 331891
Distinct characters: 24
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: node
2nd row: node
3rd row: node
4th row: switch_module
5th row: switch_module

Common Values

Value | Count | Frequency (%)
switch_module | 13278 | 30.9%
node | 11144 | 25.9%
gige | 10094 | 23.5%
unix.hw | 2992 | 7.0%
action | 2709 | 6.3%
clusterfilesystem | 1579 | 3.7%
partition | 581 | 1.4%
boot_cmd | 403 | 0.9%
domain | 150 | 0.3%
shutdown_cmd | 51 | 0.1%
Other values (3) | 41 | 0.1%

Length

[Figure: histogram of lengths of the category]

Most occurring characters

Value | Count | Frequency (%)
e | 39337 | 11.9%
i | 31965 | 9.6%
o | 28719 | 8.7%
d | 25086 | 7.6%
t | 20798 | 6.3%
g | 20188 | 6.1%
s | 18103 | 5.5%
c | 18025 | 5.4%
u | 17900 | 5.4%
n | 17628 | 5.3%
Other values (14) | 94142 | 28.4%

Most occurring categories, scripts, and blocks

Category, script, and block are all reported as (unknown) for 331891 characters (100.0%). The per-category, per-script, and per-block character frequencies are identical to the most occurring characters.

Event
Categorical

High correlation 

Distinct: 20
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.0 MiB

Value | Count
temperature | 15525
error | 10583
status | 6420
start | 2639
net.niff.up | 2293
Other values (15) | 5562

Length

Max length: 28
Median length: 27
Mean length: 8.5757752
Min length: 3

Characters and Unicode

Total characters: 368947
Distinct characters: 24
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: status
2nd row: temperature
3rd row: status
4th row: error
5th row: bcast-error

Common Values

Value | Count | Frequency (%)
temperature | 15525 | 36.1%
error | 10583 | 24.6%
status | 6420 | 14.9%
start | 2639 | 6.1%
net.niff.up | 2293 | 5.3%
fan | 2246 | 5.2%
clusterfilesystem.not_served | 834 | 1.9%
clusterfilesystem.no_server | 489 | 1.1%
bcast-error | 391 | 0.9%
state_change.unavailable | 310 | 0.7%
Other values (10) | 1292 | 3.0%

Length

[Figure: histogram of lengths of the category]

Most occurring characters

Value | Count | Frequency (%)
r | 69831 | 18.9%
e | 68835 | 18.7%
t | 56804 | 15.4%
a | 30198 | 8.2%
u | 26356 | 7.1%
s | 22334 | 6.1%
p | 17982 | 4.9%
m | 17204 | 4.7%
o | 12546 | 3.4%
n | 9928 | 2.7%
Other values (14) | 36929 | 10.0%

Most occurring categories, scripts, and blocks

Category, script, and block are all reported as (unknown) for 368947 characters (100.0%). The per-category, per-script, and per-block character frequencies are identical to the most occurring characters.

Timestamp
DateTime

Distinct: 36213
Distinct (%): 84.2%
Missing: 0
Missing (%): 0.0%
Memory size: 672.2 KiB
Minimum: 2003-12-26 12:36:59
Maximum: 2006-04-30 09:58:24
Invalid dates: 0
Invalid dates (%): 0.0%
[Figure: histogram with fixed size bins (bins=50)]

Unknown
Categorical

Imbalance 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Value | Count
1 | 41872
0 | 1150

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 43022
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 41872 | 97.3%
0 | 1150 | 2.7%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Most occurring characters

Value | Count | Frequency (%)
1 | 41872 | 97.3%
0 | 1150 | 2.7%

Most occurring categories, scripts, and blocks

Category, script, and block are all reported as (unknown) for 43022 characters (100.0%). The per-category, per-script, and per-block character frequencies are identical to the most occurring characters.

Message
Text

Distinct: 3792
Distinct (%): 8.8%
Missing: 0
Missing (%): 0.0%
Memory size: 3.8 MiB

Length

Max length: 811
Median length: 762
Mean length: 26.602854
Min length: 4

Characters and Unicode

Total characters: 1144508
Distinct characters: 67
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2013
Unique (%): 4.7%

Sample

1st row: running
2nd row: ambient=30
3rd row: configured out
4th row: Linkerror event interval expired
5th row: Link error

Words

Value | Count | Frequency (%)
linkerror | 8492 | 5.3%
interval | 8492 | 5.3%
expired | 8492 | 5.3%
event | 8492 | 5.3%
(blank) | 6505 | 4.0%
warning | 5063 | 3.1%
network | 4808 | 3.0%
normal | 4680 | 2.9%
node | 3248 | 2.0%
command | 3094 | 1.9%
Other values (2252) | 99707 | 61.9%

Most occurring characters

Value | Count | Frequency (%)
e | 125056 | 10.9%
(whitespace) | 120737 | 10.5%
n | 107316 | 9.4%
r | 87619 | 7.7%
i | 67255 | 5.9%
o | 59554 | 5.2%
t | 58815 | 5.1%
a | 58652 | 5.1%
d | 34355 | 3.0%
l | 30867 | 2.7%
Other values (57) | 394282 | 34.4%

Most occurring categories, scripts, and blocks

Category, script, and block are all reported as (unknown) for 1144508 characters (100.0%). The per-category, per-script, and per-block character frequencies are identical to the most occurring characters.

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]

 | Event | ID | Subsystem | Unknown
Event | 1.000 | 0.085 | 0.664 | 0.387
ID | 0.085 | 1.000 | 0.114 | 0.000
Subsystem | 0.664 | 0.114 | 1.000 | 0.273
Unknown | 0.387 | 0.000 | 0.273 | 1.000
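
The two High correlation alerts correspond to the only off-diagonal coefficient in this matrix that stands out, Event versus Subsystem at 0.664. A minimal sketch that scans the matrix for such pairs; the 0.5 threshold is an assumption chosen for illustration, and `high_pairs` is a hypothetical helper name.

```python
# Coefficients copied from the correlation matrix above.
names = ["Event", "ID", "Subsystem", "Unknown"]
matrix = [
    [1.000, 0.085, 0.664, 0.387],
    [0.085, 1.000, 0.114, 0.000],
    [0.664, 0.114, 1.000, 0.273],
    [0.387, 0.000, 0.273, 1.000],
]

def high_pairs(names, matrix, threshold=0.5):
    """List each unordered pair of variables whose coefficient meets the threshold."""
    pairs = []
    for i in range(len(names)):
        # Only look above the diagonal so each pair is reported once.
        for j in range(i + 1, len(names)):
            if abs(matrix[i][j]) >= threshold:
                pairs.append((names[i], names[j], matrix[i][j]))
    return pairs

print(high_pairs(names, matrix))
```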

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

Sample

First rows

(index) | ID | Node | Subsystem | Event | Timestamp | Unknown | Message
142250 | 401341 | node-244 | node | status | 2006-03-16 18:21:49 | 0 | running
267028 | 160244 | node-244 | node | temperature | 2005-10-29 07:27:00 | 1 | ambient=30
120399 | 18237 | node-94 | node | status | 2004-11-18 13:22:00 | 1 | configured out
343592 | 174111 | Interconnect-1N01 | switch_module | error | 2004-02-27 14:43:50 | 1 | Linkerror event interval expired
65565 | 115375 | Interconnect-0T00 | switch_module | bcast-error | 2004-02-26 05:33:25 | 1 | Link error
217808 | 92432 | node-197 | node | temperature | 2005-03-03 18:36:00 | 1 | ambient=31
83692 | 2596014 | node-109 | node | status | 2004-01-16 21:36:00 | 1 | configured out
148296 | 2606804 | gige7 | gige | temperature | 2004-01-17 15:14:07 | 1 | warning
179027 | 394103 | gige7 | gige | temperature | 2004-06-30 15:32:45 | 1 | warning
130695 | 284681 | node-172 | node | status | 2005-07-06 20:13:37 | 1 | configured out

Last rows

(index) | ID | Node | Subsystem | Event | Timestamp | Unknown | Message
303785 | 2561163 | Interconnect-1N01 | switch_module | error | 2004-01-06 03:35:47 | 1 | Link in reset
151514 | 31411 | gige7 | gige | temperature | 2004-02-09 10:55:59 | 1 | warning
225548 | 140243 | gige4 | gige | temperature | 2005-03-30 10:49:11 | 1 | normal
321379 | 2606943 | Interconnect-1N01 | switch_module | error | 2004-01-17 22:56:12 | 1 | Linkerror event interval expired
352479 | 262162 | Interconnect-1N01 | switch_module | error | 2004-03-14 01:16:39 | 1 | Linkerror event interval expired
256272 | 90975 | gige6 | gige | temperature | 2005-09-30 18:21:41 | 1 | warning
164176 | 316277 | gige7 | gige | temperature | 2004-04-23 18:37:09 | 1 | warning
246194 | 327030 | gige4 | gige | temperature | 2005-08-06 16:58:06 | 1 | warning
35611 | 127486 | Interconnect-0N01 | switch_module | temphigh | 2005-03-26 10:12:00 | 1 | Temperature (42C) exceeds warning threshold
409194 | 139144 | node-18 | unix.hw | net.niff.up | 2004-02-26 15:43:56 | 1 | NIFF: node node-18 has detected an available network connection on network 5.5.226.0 via interface alt0

Duplicate rows

Most frequently occurring

(index) | ID | Node | Subsystem | Event | Timestamp | Unknown | Message | # duplicates
32 | 256707 | node-161 | node | status | 2005-06-16 16:02:45 | 0 | running | 3
89 | 272478 | node-86 | node | status | 2004-03-18 13:08:56 | 0 | not responding | 3
145 | 413291 | node-238 | node | status | 2004-08-12 14:21:39 | 0 | configured out | 3
171 | 56340 | node-188 | node | status | 2005-09-15 20:09:30 | 0 | running | 3
172 | 56375 | node-225 | node | status | 2005-09-15 20:09:30 | 0 | running | 3
0 | 112848 | node-109 | node | temperature | 2005-10-09 11:25:30 | 1 | ambient=31 | 2
1 | 163656 | 3910 | boot_cmd | success | 2005-10-30 11:31:17 | 1 | Command has completed successfully | 2
2 | 24160 | node-226 | node | status | 2004-11-18 14:35:49 | 0 | running | 2
3 | 251011 | node-72 | node | status | 2005-06-16 12:41:38 | 0 | not responding | 2
4 | 251070 | node-100 | node | status | 2005-06-16 12:43:30 | 0 | configured out | 2